
A Link Analysis Extension of Correspondence Analysis for Mining Relational Databases



Abstract

This work introduces a link analysis procedure for discovering relationships in a relational database or a graph, generalizing both simple and multiple correspondence analysis. It is based on a random walk model through the database, which defines a Markov chain having as many states as there are elements in the database. Suppose we are interested in analyzing the relationships between some elements (or records) contained in two different tables of the relational database. To this end, in a first step, a much smaller reduced Markov chain, containing only the elements of interest and preserving the main characteristics of the initial chain, is extracted by stochastic complementation [41]. This reduced chain is then analyzed by jointly projecting the elements of interest into the diffusion map subspace [42] and visualizing the results. This two-step procedure reduces to simple correspondence analysis when only two tables are defined, and to multiple correspondence analysis when the database takes the form of a simple star schema. In addition, a kernel version of the diffusion map distance, generalizing the basic diffusion map distance to directed graphs, is introduced, and its links with spectral clustering are discussed. Several data sets are analyzed with the proposed methodology, showing the usefulness of the technique for extracting relationships in relational databases or graphs.
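
The two-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the function names `stochastic_complement` and `diffusion_map`, and the parameters `dim` and `t` are assumptions made for illustration. The sketch assumes undirected links, so the random walk is reversible and the basic diffusion map applies; it does not cover the kernel version for directed graphs. The first step uses the standard stochastic complement formula S = P11 + P12 (I - P22)^{-1} P21, where block 1 holds the elements of interest.

```python
import numpy as np

def stochastic_complement(P, keep):
    """Reduce a row-stochastic matrix P to the states listed in `keep`."""
    n = P.shape[0]
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(n), keep)
    P11 = P[np.ix_(keep, keep)]
    P12 = P[np.ix_(keep, drop)]
    P21 = P[np.ix_(drop, keep)]
    P22 = P[np.ix_(drop, drop)]
    # S = P11 + P12 (I - P22)^{-1} P21, computed with a linear solve
    return P11 + P12 @ np.linalg.solve(np.eye(len(drop)) - P22, P21)

def diffusion_map(P, pi, dim=2, t=1):
    """Basic diffusion map of a reversible chain P with stationary distribution pi."""
    d = np.sqrt(pi)
    # D^{1/2} P D^{-1/2} is symmetric when the chain is reversible
    M = d[:, None] * P / d[None, :]
    M = (M + M.T) / 2  # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(M)
    # Sort by |eigenvalue| and skip the trivial eigenvalue 1
    order = np.argsort(-np.abs(vals))[1:dim + 1]
    # Right eigenvectors of P, scaled by eigenvalue^t
    return (vecs[:, order] / d[:, None]) * vals[order] ** t

# Hypothetical toy database: elements 0-2 (table A) and 6-8 (table B)
# are linked only through elements 3-5 of an intermediate table.
edges = [(0, 3), (1, 3), (1, 4), (2, 5), (3, 6), (4, 6), (4, 7), (5, 7), (5, 8)]
A = np.zeros((9, 9))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

deg = A.sum(axis=1)
P = A / deg[:, None]                      # random walk on the full graph

keep = [0, 1, 2, 6, 7, 8]                 # elements of interest
S = stochastic_complement(P, keep)        # step 1: reduced chain
pi_keep = deg[keep] / deg[keep].sum()     # stationary distribution of S
coords = diffusion_map(S, pi_keep, dim=2, t=1)  # step 2: joint embedding
print(np.round(coords, 3))
```

Each row of `coords` places one element of table A or table B in the same low-dimensional space, so elements from the two tables can be plotted together. For an undirected graph the stationary distribution of the reduced chain is proportional to the node degrees of the retained elements, which is why `pi_keep` can be taken directly from `deg`.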
